A two-level Drive - Response Model of Instationary Speech Signals
Abstract
The transmission protocol of voiced speech is hypothesized to be based on a fundamental excitation or drive process, which synchronizes the vocal tract excitation on the transmitter side and evokes the loudness and pitch perception on the receiver side. The fundamental drive can be extracted from the speech signal by means of a voice-specific subband decomposition. When used as the fundamental drive of a two-level drive-response model with stationary coupling on both levels, the non-stationary drive is able to describe non-stationary speech as a secondary response. For simplicity, each subband-specific primary response is assumed to be restricted to a nonlinear synchronization manifold. Whereas the extraction of a physiologically interpretable fundamental phase is limited to voiced sections of speech, the fundamental amplitude can also be used for the time scale separation of unvoiced sections.
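To make the idea of a fundamental amplitude and phase more concrete, the following minimal sketch isolates the band around the fundamental of a synthetic harmonic signal and reads off the amplitude and phase from the analytic signal. It is only an illustration: the plain DFT band selection is a generic stand-in and not the voice-specific subband decomposition of the abstract, the test signal is stationary, and all names (`fundamental_drive`, `synthetic_voiced`, the band limits) are assumptions introduced here.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def fundamental_drive(signal, fs, f_lo, f_hi):
    """Return (amplitude, phase) of the band [f_lo, f_hi] Hz.

    Hypothetical helper: keeps only positive-frequency bins inside the band
    and doubles them, which yields the analytic signal of that subband.
    """
    n = len(signal)
    spectrum = dft(signal)
    analytic = [0j] * n
    for k in range(1, n // 2):
        if f_lo <= k * fs / n <= f_hi:
            analytic[k] = 2 * spectrum[k]
    z = idft(analytic)
    return [abs(v) for v in z], [cmath.phase(v) for v in z]


def synthetic_voiced(fs, n, f0):
    """Toy 'voiced' frame: a fundamental plus two weaker harmonics."""
    return [math.cos(2 * math.pi * f0 * t / fs)
            + 0.5 * math.cos(2 * math.pi * 2 * f0 * t / fs + 0.3)
            + 0.25 * math.cos(2 * math.pi * 3 * f0 * t / fs + 1.0)
            for t in range(n)]
```

For a frame built with `synthetic_voiced(1600, 320, 100.0)` and a band of 50 to 150 Hz, the returned amplitude stays at the amplitude of the fundamental component and the phase advances by a quarter cycle every four samples. Real speech would require a time-varying band that follows the fundamental, which this sketch does not attempt.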
Similar resources
A Two-Level Drive - Response Model of Non-stationary Speech Signals
The transmission protocol of voiced speech is hypothesized to be based on a fundamental excitation or drive process, which synchronizes the vocal tract excitation on the transmitter side and evokes the loudness and pitch perception on the receiver side. The fundamental drive can be extracted from the speech signal by using a voice-specific subband decomposition. When used as fundamental drive o...
Topologically equivalent reconstruction of instationary, voiced speech
Voiced speech is characterized by qualitatively rich mode locking phenomena linking harmonically excited acoustic modes of the vocal tract. Due to the strong non-stationarity of speech, a differentiated analysis of these modes cannot be achieved with the help of a linear, time-invariant source and filter model (based on stationary sources). As an alternative, the characteristic mode locking is descr...
A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain
The quality of a speech signal degrades significantly in the presence of environmental noise, which impairs the performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel enhancement of speech signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...
Neural Network Modeling of Speech and Music Signals
Time series prediction is one of the major applications of neural networks. After a short introduction to the basic theoretical foundations, we argue that the iterated prediction of a dynamical system may be interpreted as a model of the system dynamics. By means of RBF neural networks we describe a modeling approach and extend it to be able to model non-stationary systems. As a practical test f...
Voiced excitation as entrained p a reconstructed glottal ma
A time scale separation of voiced speech signals is introduced, which avoids the assumption of a frequency gap between the acoustic response and the prosodic drive. The non-stationary drive is extracted self-consistently from a voice-specific subband decomposition of the speech signal. When the band-limited prosodic drive is used as the fundamental drive of a two-level drive-response model, the voic...